Variable selection in discriminant partial least-squares analysis.
Authors
Abstract
Variable selection enhances the understanding and interpretability of multivariate classification models. A new chemometric method based on the selection of the most important variables in discriminant partial least-squares (VS-DPLS) analysis is described. The suggested method is a simple extension of DPLS where a small number of elements in the weight vector w is retained for each factor. The optimal number of DPLS factors is determined by cross-validation. The new algorithm is applied to four different high-dimensional spectral data sets with excellent results. Spectral profiles from Fourier transform infrared spectroscopy and pyrolysis mass spectrometry are used. To investigate the uniqueness of the selected variables an iterative VS-DPLS procedure is performed. At each iteration, the previously found selected variables are removed to see if a new VS-DPLS classification model can be constructed using a different set of variables. In this manner, it is possible to determine regions rather than individual variables that are important for a successful classification.
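The procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-class problem coded as -1/+1, a NIPALS-style PLS1 fit, and that "retaining a small number of elements in w" means keeping the n_keep entries with the largest absolute value and zeroing the rest. The function names (vs_dpls_fit, vs_dpls_predict, iterative_vs_dpls), the parameter n_keep, and the synthetic data are all hypothetical. Cross-validation over the number of factors is omitted for brevity.

```python
import numpy as np

def vs_dpls_fit(X, y, n_factors, n_keep):
    """PLS1 discriminant fit; each weight vector w keeps only its n_keep largest |entries|.

    Assumption: truncation by absolute size is one plausible reading of the abstract.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centred data, deflated per factor
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_factors))          # sparse weight vectors
    P = np.zeros((n_vars, n_factors))          # X loadings
    q = np.zeros(n_factors)                    # y loadings
    for a in range(n_factors):
        w = Xr.T @ yr                          # ordinary PLS1 weight vector
        keep = np.argsort(np.abs(w))[-n_keep:] # indices of the retained variables
        w_sparse = np.zeros(n_vars)
        w_sparse[keep] = w[keep]
        w_sparse /= np.linalg.norm(w_sparse)
        t = Xr @ w_sparse                      # scores from the retained variables only
        tt = t @ t
        W[:, a] = w_sparse
        P[:, a] = Xr.T @ t / tt
        q[a] = yr @ t / tt
        Xr = Xr - np.outer(t, P[:, a])         # deflate X and y
        yr = yr - q[a] * t
    b = W @ np.linalg.solve(P.T @ W, q)        # regression vector in original variables
    selected = np.flatnonzero(np.any(W != 0, axis=1))
    return b, x_mean, y_mean, selected

def vs_dpls_predict(X, b, x_mean, y_mean):
    """Assign class +1 or -1 by thresholding the predicted response at zero."""
    return np.where((X - x_mean) @ b + y_mean >= 0, 1, -1)

def iterative_vs_dpls(X, y, n_factors, n_keep, n_rounds):
    """Refit repeatedly, removing the variables selected in earlier rounds."""
    remaining = np.arange(X.shape[1])
    rounds = []
    for _ in range(n_rounds):
        Xs = X[:, remaining]
        b, xm, ym, sel = vs_dpls_fit(Xs, y, n_factors, n_keep)
        acc = np.mean(vs_dpls_predict(Xs, b, xm, ym) == y)  # training accuracy only
        rounds.append((remaining[sel], acc))
        remaining = np.delete(remaining, sel)
    return rounds

# Synthetic stand-in for a spectral data set: 40 samples, 50 variables,
# with two discriminating "regions" (variables 0-2 and 10-12).
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 20)
X = rng.normal(size=(40, 50))
informative = [0, 1, 2, 10, 11, 12]
X[:, informative] += 2.0 * y[:, None]

rounds = iterative_vs_dpls(X, y, n_factors=1, n_keep=3, n_rounds=2)
for variables, acc in rounds:
    print(sorted(variables.tolist()), acc)
```

In this sketch the accuracy is computed on the training samples only. A faithful evaluation would use cross-validation, as the abstract states for choosing the number of factors.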
Similar resources
Chemometric Studies for Quality Control of Processed Brazilian Coffees Using Drifts
In this work, the potential of mid-infrared diffuse reflectance spectroscopy with Fourier transform for discrimination of 29 commercial Brazilian coffee samples with different industrial processing, i.e., caffeine extraction and roasting degree, was evaluated. The statistical treatments applied to pretreated spectral data were principal component analysis and partial least squares – discriminan...
Multivariate Classification for Qualitative Analysis
Introduction. Principles of classification. The classes. Main categories of classification methods ...
Soil type recognition as improved by genetic algorithm-based variable selection using near infrared spectroscopy and partial least squares discriminant analysis
Soil types have traditionally been determined by soil physical and chemical properties, diagnostic horizons and pedogenic processes based on a given classification system. This is a laborious and time consuming process. Near infrared (NIR) spectroscopy can comprehensively characterize soil properties, and may provide a viable alternative method for soil type recognition. Here, we presented a pa...
Using machine learning methods to predict experimental high-throughput screening data
High-throughput screening (HTS) remains a very costly process notwithstanding many recent technological advances in the field of biotechnology. In this study we consider the application of machine learning methods for predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compounds libraries and similar dr...
Chemometric Feature Selection and Classification of Ganoderma lucidum Spores and Fruiting Body Using ATR-FTIR Spectroscopy
Ganoderma lucidum (G. lucidum) spores as a valuable Chinese herbal medicine have vast marketable prospect for its bioactivities and medicinal efficacy. This study aims at the development of an effective and simple analytical method to distinguish G. lucidum spores from its fruiting body, which is of essential importance for the quality control and fast discrimination of raw materials of Chinese...
Variable Selection and Parameter Tuning in High-Dimensional Prediction
In the context of classification using high-dimensional data such as microarray gene expression data, it is often useful to perform preliminary variable selection. For example, the k-nearest-neighbors classification procedure yields a much higher accuracy when applied on variables with high discriminatory power. Typical (univariate) variable selection methods for binary classification are, e.g....
Journal title:
- Analytical Chemistry
Volume 70, Issue 19
Pages -
Publication date 1998